A Reliability-Generalization Study of Journal Peer Reviews: A Multilevel Meta-Analysis of Inter-Rater Reliability and Its Determinants

Authors

  • Lutz Bornmann
  • Rüdiger Mutz
  • Hans-Dieter Daniel
Abstract

BACKGROUND This paper presents the first meta-analysis of the inter-rater reliability (IRR) of journal peer reviews. IRR is defined as the extent to which two or more independent reviews of the same scientific document agree. METHODOLOGY/PRINCIPAL FINDINGS Altogether, 70 reliability coefficients (Cohen's Kappa, intra-class correlation [ICC], and Pearson product-moment correlation [r]) from 48 studies were included in the meta-analysis. The studies were based on a total of 19,443 manuscripts; on average, each study had a sample size of 311 manuscripts (minimum: 28, maximum: 1983). The results of the meta-analysis confirmed the findings of the narrative literature reviews published to date: the level of IRR (mean ICC/r² = .34, mean Cohen's Kappa = .17) was low. To explain the study-to-study variation of the IRR coefficients, meta-regression analyses were calculated using seven covariates. Two covariates emerged as statistically significant in achieving approximate homogeneity of the intra-class correlations. First, the more manuscripts a study is based on, the smaller the reported IRR coefficients are. Second, studies that reported information on the rating system for reviewers had smaller IRR coefficients than studies that did not report it. CONCLUSIONS/SIGNIFICANCE Studies that report a high level of IRR are to be considered less credible than those with a low level of IRR. According to our meta-analysis, the IRR of peer assessments is quite limited and needs improvement (e.g., reader system).
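As an illustration of one of the coefficients named in the abstract, the following minimal sketch computes Cohen's Kappa for two reviewers who each give an accept/reject recommendation on the same manuscripts. It is not the authors' analysis code. The function name `cohens_kappa`, the variable names, and the ten ratings are hypothetical and chosen only to show the arithmetic: Kappa is the observed agreement corrected for the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two raters judging the same items (unweighted)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(ratings_a)

    # Observed agreement: share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (observed - expected) / (1 - expected)


# Hypothetical recommendations for ten manuscripts (A = accept, R = reject).
reviewer_a = ["A", "A", "A", "A", "A", "R", "R", "R", "R", "R"]
reviewer_b = ["A", "A", "A", "R", "R", "A", "A", "R", "R", "R"]

print(round(cohens_kappa(reviewer_a, reviewer_b), 2))
```

With these invented ratings the reviewers agree on 6 of 10 manuscripts, chance agreement is 0.5, and Kappa comes out at about 0.2. The example shows how a raw agreement rate of 60% can still correspond to a low chance-corrected coefficient.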

Similar resources

Heterogeneity of Inter-Rater Reliabilities of Grant Peer Reviews and Its Determinants: A General Estimating Equations Approach

BACKGROUND One of the most important weaknesses of the peer review process is that different reviewers' ratings of the same grant proposal typically differ. Studies on the inter-rater reliability of peer reviews mostly report only average values across all submitted proposals. But inter-rater reliabilities can vary depending on the scientific discipline or the requested grant sum, for instance....

Evaluation of Spasticity Using the Ashworth Scale with Intermediate Scores (ASIS)

Objectives: The main purpose of this research was to study and contribute to an accurate test of the spastic limb. The intra- and inter-rater reliability of the test was examined. Methods: The present study was carried out in two parts. In the first part of the study, the modified Ashworth Scale with Intermediate Scores (ASIS) was studied. During the second part of the study the intra- and inter-rater re...

Test-Retest and Inter-Rater Reliability Study of the Schedule for Oral-Motor Assessment in Persian Children

Objectives: Reliable and valid clinical tools to screen, diagnose, and describe eating functions and dysphagia in children are highly warranted. Today most specialists are aware of the role of assessment scales in the treatment of affected individuals. However, the problem is that the clinical tools used might be nonstandard, and worldwide, there is no integrated assessment performed to assess ...

Functional Movement Screen in Elite Boy Basketball Players: A Reliability Study

Purpose: To investigate the reliability of the Functional Movement Screen (FMS) in basketball players. A few studies have compared the reliability of FMS between raters with different experience in athletes. The purpose of this study was to compare FMS scoring between beginner and expert raters using video records. Methods: This is a cross-sectional study. The study subjects compris...

Nurse-Physician Agreement on Triage Category: A Reliability Analysis of Emergency Severity Index

Background and Objectives: The Emergency Severity Index (ESI) triage is commonly used in clinical settings to determine patients' emergency severity. However, the reliability of this index is not sufficiently explored. The present study examines the inter-rater reliability of ESI by comparing triage ratings as performed by nurses and physicians. Methods: This prospective cross-sectional st...

Journal title:

Volume: 5, Issue:

Pages: -

Publication date: 2010